Abstract

The notion of distinguishability between quantum states has proved to be fundamental in the
framework of quantum information theory. In this paper we present a new distinguishability
criterion based on an information-theoretic quantity: the Jensen-Shannon divergence (JSD). This
quantity has several interesting properties, both from a conceptual and a formal point of view.
Before defining this distinguishability criterion, we review some of the most frequently used
distances defined over the Hilbert space of quantum mechanics. Our main claim is that the JSD
can be taken as a unifying distance between quantum states.
I. INTRODUCTION

The problem of measurement is an issue of central importance in quantum theory that,
since the pioneering days of the twenties, has given rise to controversies. Many of the
most astonishing results of quantum mechanics are related to the particular properties of
the measurement process. In recent years, the unique character of quantum measurement
has led to a new field of research: quantum information technology. From a formal point
of view, a measurement in quantum theory is described by means of a Hermitian operator.
If the eigenstates of this operator are $|\phi_k\rangle$ and the state of the system to be measured is
$|\Psi\rangle = \sum_k c_k |\phi_k\rangle$, then, according to the axioms of quantum theory, the result of the
measurement will, with probability $|c_k|^2$, be the corresponding eigenvalue $a_k$, represented
physically by an appropriate state of the measuring device A.

A closely related theme is that of the distinguishability between states, that is, just how we
can discern between two states $|\Psi^{(1)}\rangle$ and $|\Psi^{(2)}\rangle$ of a given physical system by using the
measuring device A. In a seminal paper, Wootters investigated this problem and introduced a
"distinguishability-distance" between pure states in the associated Hilbert space. Braunstein
and Caves extended this distance to density operators for mixed states [5]. Wootters'
distinguishability criterion can be established, within the framework of probability theory
(independently of any quantum interpretation), in the following way [4]: two probability
distributions, say, $p^{(1)} = (p_1, p_2, \ldots, p_N)$ and $p^{(2)} = (q_1, q_2, \ldots, q_N)$, are distinguishable after
L trials ($L \to \infty$) if and only if the condition

$$\frac{\sqrt{L}}{2}\left[\sum_{i=1}^{N}\frac{(\delta p_i)^2}{p_i}\right]^{1/2} > 1 \tag{1}$$

with $\delta p_i = p_i - q_i$, is satisfied. This distinguishability criterion involves a distance defined
over the space of probability distributions

$$ds_X^{W} = \frac{1}{2}\left[\sum_{i=1}^{N}\frac{(\delta p_i)^2}{p_i}\right]^{1/2}. \tag{2}$$

Statisticians call the square of this form the $\chi^2$ distance. Wootters maps this distance
into the associated Hilbert space and establishes a correspondence with the usual notion of
distance between states in Hilbert space.
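
As a rough numerical illustration of criterion (1), the following Python sketch evaluates the line element (2) for two nearby distributions and tests the inequality for a given number of trials. The function names and the sample distributions are illustrative assumptions, not part of the original treatment, and every $p_i$ is assumed to be strictly positive.

```python
import math

def wootters_line_element(p, q):
    """Eq. (2): ds = (1/2) * sqrt(sum_i (p_i - q_i)^2 / p_i); needs p_i > 0."""
    return 0.5 * math.sqrt(sum((pi - qi) ** 2 / pi for pi, qi in zip(p, q)))

def distinguishable(p, q, trials):
    """Criterion (1): sqrt(L) * ds > 1, with L the number of trials."""
    return math.sqrt(trials) * wootters_line_element(p, q) > 1.0

# Hypothetical example: a fair coin versus a slightly biased one.
p = [0.50, 0.50]
q = [0.52, 0.48]
ds = wootters_line_element(p, q)   # close to 0.02
threshold = 1.0 / ds ** 2          # roughly 2500 trials are needed

print(ds, threshold)
print(distinguishable(p, q, 100), distinguishable(p, q, 10000))
```

The closer the two distributions, the smaller the line element and the larger the number of trials needed, which grows as the inverse square of the distance.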

In addition to its relevance with regard to the distinguishability issue, the concept of
distance between different states in a Hilbert space plays an important role in a diversity of
circumstances:

• the study of the geometric properties of the quantum evolution sub-manifold,

• in discussing squeezed coherent states or generalized coherent spin states,

• in ascertaining the quality of approximate treatments.

It has recently been recognized that the concept of distinguishability is basic to manipulating
information, in the sense that being able to discern between different physical states of a
given system allows one to determine just how much information can be encoded into that
system, so that the notion of distinguishability builds a bridge between quantum theory and
information theory.

In this work we will try to strengthen this connection by investigating the relation between
Wootters' distance and a suitable metric for the space of probability distributions that is used
in information theory: the Jensen-Shannon divergence (JSD). Recently, the JSD has been
exhaustively studied in different contexts. It has many interesting interpretations, both
in the framework of information theory and in the context of mathematical statistics. One of
its basic properties is that its square root is a true metric in the space of probability
distributions, i.e., its square root is a distance that verifies the triangle inequality [11]. This fact
is quite relevant, since metric properties are crucial for the application of many important
convergence theorems that one needs when iterative algorithms are studied.

The purpose of this paper is twofold:

1. first, we pursue a pedagogical objective by reviewing some distances and metrics
commonly used in quantum theory. Even though many of the results presented here are
known, they are not always presented from a unified perspective, at least in the physics
literature,

2. second, we formulate a distinguishability criterion for quantum mechanics based on
the JSD.

Finally, some conclusions are drawn.
II. A PRIMER ON HILBERT SPACE DISTANCES

Let $|\phi_1\rangle, \ldots, |\phi_N\rangle$ be the eigenstates of a given Hermitian operator associated with the
measuring instrument A. For simplicity's sake we assume that no degeneracy exists. Thus,
in a given measurement N possible results may ensue. If we have prepared the system in the
(normalized) state $|\Psi^{(1)}\rangle$, each of these results can be found with probability $|\langle\phi_i|\Psi^{(1)}\rangle|^2$. If
we prepare it, instead, in the state $|\Psi^{(2)}\rangle$, this probability is $|\langle\phi_i|\Psi^{(2)}\rangle|^2$. Since the basis
is complete,

$$\sum_i |\phi_i\rangle\langle\phi_i| = 1, \tag{3}$$

one has

$$\sum_i |\langle\phi_i|\Psi^{(1)}\rangle|^2 = \sum_i |\langle\phi_i|\Psi^{(2)}\rangle|^2 = 1. \tag{4}$$

Let us write

$$p_i^{(j)} = |\langle\phi_i|\Psi^{(j)}\rangle|^2, \qquad j = 1, 2. \tag{5}$$

An alternative way of looking at things is as follows. Let

$$X_N^+ = \Big\{(p_1, \ldots, p_N);\ 0 \le p_i \le 1;\ \sum_i p_i = 1\Big\} \tag{6}$$

be the set of discrete probability distributions (generalization to continuous ones being
straightforward) and let S be the set of normalized states in the Hilbert space $\mathcal{H}^{n+1}$, $n+1 = N$.
To each state $|\Psi\rangle$ in S (indeed to a ray $\lambda|\Psi\rangle$, $\lambda = e^{i\alpha}$) we assign an element $\{p_i\}$ of
$X_N^+$ through the application $\mathcal{F}_A$ given by:

$$|\Psi\rangle \to \{p_i\} \quad \text{such that} \quad p_i = |\langle\phi_i|\Psi\rangle|^2. \tag{7}$$

Obviously, the application $\mathcal{F}_A$ is consistent with expressions (4) and (5).

Let $s_X(p^{(1)}, p^{(2)})$ be a distance defined on the space of probability distributions $X_N^+$,
that is, an application from $X_N^+ \times X_N^+$ into $\mathbb{R}$ such that $s_X$ is symmetric and $s_X(p^{(1)}, p^{(2)}) = 0$
if and only if $p^{(1)} = p^{(2)}$. One can associate to $s_X(p^{(1)}, p^{(2)})$ a distance in the space
$\mathcal{H}^{n+1}$, $s_H^A(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle)$, through the application $\mathcal{F}_A$. Let us note that this distance depends
upon the measuring instrument A. Our objective is to find a representative distance of
$s_X(p^{(1)}, p^{(2)})$ in Hilbert space independently of the basis $|\phi_k\rangle$. This will be attained by
looking for the maximum of the associated distance $s_H^A$. We discuss some examples below.
The pertinent distances are given proper names (e.g., Wootters), according to common
usage.

Notation remark: We will use the following notation: $s_X$ denotes a distance defined over
$X_N^+$; $s_H^A$ denotes the corresponding distance over $\mathcal{H}^{n+1}$ obtained from the correspondence
induced by application $\mathcal{F}_A$; $S_H$ denotes the maximum of $s_H^A$.

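A. Wootters distance

The Wootters distance between two probability distributions $p^{(1)}$ and $p^{(2)}$ is defined as

$$s_X^{W}(p^{(1)}, p^{(2)}) = \arccos\Big(\sum_i \sqrt{p_i^{(1)}}\sqrt{p_i^{(2)}}\Big). \tag{8}$$

When $p^{(1)} \to p^{(2)}$, the form (2) is reobtained.
By using the correspondence (7), we can write

$$s_H^{W,A}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle) = \arccos\Big(\sum_i |\langle\phi_i|\Psi^{(1)}\rangle|\,|\langle\phi_i|\Psi^{(2)}\rangle|\Big). \tag{9}$$

Note that arccos(x) decreases in [0, 1]. Also, the following inequality

$$\sum_i |\langle\phi_i|\Psi^{(1)}\rangle|\,|\langle\phi_i|\Psi^{(2)}\rangle| \ge |\langle\Psi^{(1)}|\Psi^{(2)}\rangle| \tag{10}$$

is true for all bases. Indeed, assume $|\Psi^{(1)}\rangle = \sum_k a_k|\phi_k\rangle$ and $|\Psi^{(2)}\rangle = \sum_k b_k|\phi_k\rangle$. Then,

$$|\langle\Psi^{(1)}|\Psi^{(2)}\rangle| = \Big|\sum_k a_k^{*} b_k\Big| \le \sum_k |a_k^{*} b_k| = \sum_k |\langle\phi_k|\Psi^{(1)}\rangle|\,|\langle\phi_k|\Psi^{(2)}\rangle|. \tag{11}$$

Inequality (10), together with the decreasing nature of the arccos function, implies that the
distance

$$S_H^{W}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle) = \arccos\big(|\langle\Psi^{(1)}|\Psi^{(2)}\rangle|\big) \tag{12}$$

maximizes $s_H^{W,A}$. In this way we arrive at the distance associated to Wootters' one in
Hilbert space. Geometrically, it gives the angle between the two states (rays) $|\Psi^{(1)}\rangle$ and
$|\Psi^{(2)}\rangle$.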
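
The maximization argument can be checked numerically. The sketch below is a minimal illustration for a two-dimensional Hilbert space: it samples random orthonormal bases, evaluates the basis-dependent distance (9) and compares it with the basis-independent value (12). The parametrization of the bases, the sample states and all function names are assumptions made only for this example.

```python
import cmath
import math
import random

def inner(u, v):
    """<u|v> for states given as lists of complex amplitudes."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def wootters_in_basis(psi1, psi2, basis):
    """Eq. (9): arccos of sum_i |<phi_i|psi1>| |<phi_i|psi2>|."""
    overlap = sum(abs(inner(phi, psi1)) * abs(inner(phi, psi2)) for phi in basis)
    return math.acos(min(1.0, overlap))

def wootters_max(psi1, psi2):
    """Eq. (12): arccos |<psi1|psi2>|."""
    return math.acos(min(1.0, abs(inner(psi1, psi2))))

def random_basis(rng):
    """A random orthonormal basis of a two-dimensional space (illustrative)."""
    a = rng.uniform(0.0, math.pi / 2)
    phase = cmath.exp(1j * rng.uniform(0.0, 2 * math.pi))
    return [[complex(math.cos(a)), phase * math.sin(a)],
            [-phase.conjugate() * math.sin(a), complex(math.cos(a))]]

rng = random.Random(0)
psi1 = [1.0 + 0j, 0j]
psi2 = [complex(math.cos(0.6)), cmath.exp(0.3j) * math.sin(0.6)]

bound = wootters_max(psi1, psi2)  # the angle 0.6 between the two rays
largest = max(wootters_in_basis(psi1, psi2, random_basis(rng)) for _ in range(1000))
# A basis that contains psi1 itself attains the maximum.
attained = wootters_in_basis(psi1, psi2, [[1.0 + 0j, 0j], [0j, 1.0 + 0j]])

print(bound, largest, attained)
```

No sampled basis exceeds the bound, and the bound is reached when one of the states belongs to the measurement basis.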
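B. Hellinger distance

Let $s_X^{H}$ be a distance in $X_N^+$ such that its square reads

$$(s_X^{H})^2 = \frac{1}{2}\sum_i \Big(\sqrt{p_i^{(1)}} - \sqrt{p_i^{(2)}}\Big)^2. \tag{13}$$

Its $\mathcal{H}^{n+1}$ counterpart satisfies

$$(s_H^{H,A})^2 = \frac{1}{2}\sum_i \Big\{|\langle\phi_i|\Psi^{(1)}\rangle| - |\langle\phi_i|\Psi^{(2)}\rangle|\Big\}^2, \tag{14}$$

that can be cast as

$$(s_H^{H,A})^2 = 1 - \sum_i |\langle\phi_i|\Psi^{(1)}\rangle|\,|\langle\phi_i|\Psi^{(2)}\rangle|. \tag{15}$$

We see that, according to inequality (10), the distance

$$S_H^{H}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle) = \big(1 - |\langle\Psi^{(1)}|\Psi^{(2)}\rangle|\big)^{1/2} \tag{16}$$

is the maximum of the associated distance $s_H^{H,A}$. It is known as the Hellinger distance and,
up to a factor $\sqrt{2}$, it represents the sine of the half angle between the two Hilbert space
vectors $|\Psi^{(1)}\rangle$ and $|\Psi^{(2)}\rangle$ [12].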



C. Bhattacharyya distance

Another distinguishability measure arises from Bhattacharyya coefficients. For two
probability distributions $p^{(1)}$ and $p^{(2)}$, the Bhattacharyya coefficient is defined by

$$B(p^{(1)}, p^{(2)}) = \sum_i \sqrt{p_i^{(1)}}\sqrt{p_i^{(2)}}. \tag{17}$$

Out of this coefficient we can define a distance between probability distributions:

$$s_X^{B} = -\ln\big(B(p^{(1)}, p^{(2)})\big). \tag{18}$$

Note that the Wootters distance can also be expressed in terms of the coefficient $B(p^{(1)}, p^{(2)})$
as $s_X^{W} = \arccos(B(p^{(1)}, p^{(2)}))$. It is worth mentioning that the distance (18) is not a metric,
because it does not verify the triangle inequality; the Wootters distance, being the angle
between the unit vectors $(\sqrt{p_i^{(1)}})$ and $(\sqrt{p_i^{(2)}})$, does verify it.
The associated distance to (18) in Hilbert space is

$$s_H^{B,A} = -\ln\sum_i |\langle\phi_i|\Psi^{(1)}\rangle|\,|\langle\phi_i|\Psi^{(2)}\rangle|. \tag{19}$$

Now, since the function $-\ln(x)$ decreases with x, on the basis of (10) we gather that

$$S_H^{B}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle) = -\ln|\langle\Psi^{(1)}|\Psi^{(2)}\rangle| \tag{20}$$

is the maximum of Bhattacharyya's distance.

In these examples we focused attention upon the maxima. Also, we have been able to cast
all these distances as functions of a Riemannian Hilbert-space metric: an "angle" between
rays, the only one that remains invariant under the action of the time-evolution unitary
operator.
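
For probability distributions, the three distances of this section can be compared side by side. The following sketch implements (8), (13) and (18) through the Bhattacharyya coefficient (17) and evaluates the triangle inequality on one three-point example; the chosen distributions and the helper names are assumptions made for illustration only.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Eq. (17): B = sum_i sqrt(p_i q_i)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def wootters(p, q):
    """Eq. (8): arccos(B)."""
    return math.acos(min(1.0, bhattacharyya_coefficient(p, q)))

def hellinger(p, q):
    """Square root of Eq. (13)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def bhattacharyya(p, q):
    """Eq. (18): -ln(B)."""
    return -math.log(bhattacharyya_coefficient(p, q))

# Hypothetical three-point example: two nearly deterministic distributions
# and the uniform distribution in between.
p, q, r = [0.99, 0.01], [0.5, 0.5], [0.01, 0.99]

def triangle_gap(distance):
    """d(p,q) + d(q,r) - d(p,r); a negative value violates the triangle inequality."""
    return distance(p, q) + distance(q, r) - distance(p, r)

for name, d in [("Wootters", wootters), ("Hellinger", hellinger), ("Bhattacharyya", bhattacharyya)]:
    print(name, triangle_gap(d))
```

On this example the distance (18) shows a clearly negative gap, whereas the other two do not (the Wootters gap vanishes up to rounding, because the three points lie on one arc). The identity between the squared Hellinger distance and $1 - B$ can be verified with the same helpers.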



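D. Fubini-Study metric

Let us recall that the set of rays of the Hilbert space $\mathcal{H}^{n+1}$ is isomorphic to the n-dimensional
complex projective space $P^n$, that is, the quotient space

$$P^n = (\mathbb{C}^{n+1} - \{0\})/\sim \tag{21}$$

with $\sim$ the equivalence relation given by

$$|\psi\rangle \sim |\phi\rangle \ \text{ iff } \ \exists\, \lambda \in \mathbb{C} - \{0\} \ \text{such that} \ |\psi\rangle = \lambda|\phi\rangle. \tag{22}$$

In this example we start with an $\mathcal{H}^{n+1}$ distance and construct one in $X_N^+$ (previously we
proceeded in reverse fashion). In $P^n$ one defines the Fubini-Study metric $\theta_{FS}$ according to

$$\cos^2\Big(\frac{\theta_{FS}}{2}\Big) = \frac{|\langle\psi|\phi\rangle|^2}{\langle\psi|\psi\rangle\langle\phi|\phi\rangle}. \tag{23}$$

For $|\psi\rangle \sim |\phi\rangle$, one has $\theta_{FS} = 0$. Maximum separation between two states is attained for
$\theta_{FS} = \pi$. Let i) $S \subset \mathcal{H}^{n+1}$ be the set of normalized states, while ii) $|\psi\rangle$ and $|\psi\rangle + |d\psi\rangle$
are two very close states in S. Normalization implies

$$2\,\mathrm{Re}(\langle\psi|d\psi\rangle) = -\langle d\psi|d\psi\rangle. \tag{24}$$

From (23), by putting $|\eta\rangle = |\psi\rangle + |d\psi\rangle$, we can evaluate the Fubini-Study distance between
two infinitely close states:

$$\cos^2\Big(\frac{d\theta_{FS}}{2}\Big) \approx \Big(1 - \frac{1}{2}\Big(\frac{d\theta_{FS}}{2}\Big)^2 + \ldots\Big)^2 \approx 1 - \frac{1}{4}\,d\theta_{FS}^2, \tag{25}$$

so that

$$d\theta_{FS}^2 = 4\big(\langle d\psi|d\psi\rangle - |\langle\psi|d\psi\rangle|^2\big). \tag{26}$$

If $|d\psi_\perp\rangle = |d\psi\rangle - |\psi\rangle\langle\psi|d\psi\rangle$ is the component of $|d\psi\rangle$ orthogonal to $|\psi\rangle$, the Fubini-Study
metric acquires the aspect

$$d\theta_{FS}^2 = 4\langle d\psi_\perp|d\psi_\perp\rangle. \tag{27}$$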

An alternative approach to the Fubini-Study metric can be found in reference 0|. 
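Assume now the following expansions for $|\psi\rangle$ and $|\eta\rangle = |\psi\rangle + |d\psi\rangle$:

$$|\psi\rangle = \sum_i \sqrt{p_i}\,|\phi_i\rangle, \qquad |\eta\rangle = \sum_i \sqrt{p_i + dp_i}\,|\phi_i\rangle, \tag{28}$$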

noticing that one might add appropriate phases in both equations. These phases, however, 
can be eliminated by a proper basis-transformation (see reference j^). The Fubini-Study 
distance between these states, up to second order in dpi becomes 

Pi 



deism m) = \Y.^^- (29) 



which can be thought of as the corresponding Fubini-Study metric between the distributions
$\{p_i\}$ and $\{p_i + dp_i\}$ over the space $X_N^+$.

III. JENSEN-SHANNON DIVERGENCE

Information theoretic measures allow one to build up quantitative entropic divergences
between two probability distributions. A common entropic measure is the Kullback-Leibler
divergence:

$$s_X^{KL}(p^{(1)}, p^{(2)}) = \sum_i p_i^{(1)} \ln\frac{p_i^{(1)}}{p_i^{(2)}}. \tag{30}$$

This distance, however, is i) not symmetric, ii) unbounded, and iii) not always well
defined. To overcome these limitations Rao and Lin introduced a symmetrized version of the
Kullback-Leibler divergence, the Jensen-Shannon divergence (JSD), which is defined as

$$s_X^{JS}(p^{(1)}, p^{(2)}) = H\Big(\frac{p^{(1)} + p^{(2)}}{2}\Big) - \frac{1}{2}H(p^{(1)}) - \frac{1}{2}H(p^{(2)}), \tag{31}$$

where $H(p) = -\sum_i p_i \ln p_i$ stands for Shannon's entropy.
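
Definition (31) translates directly into code. The sketch below is a minimal Python version, with the usual convention $0 \ln 0 = 0$; the function names are assumptions introduced for the example.

```python
import math

def shannon_entropy(p):
    """H(p) = -sum_i p_i ln p_i, with the convention 0 ln 0 = 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def jsd(p, q):
    """Eq. (31): H((p + q)/2) - H(p)/2 - H(q)/2."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mix) - 0.5 * shannon_entropy(p) - 0.5 * shannon_entropy(q)

same = jsd([0.3, 0.7], [0.3, 0.7])          # identical distributions
extreme = jsd([1.0, 0.0], [0.0, 1.0])       # two distinct deterministic distributions
forward = jsd([0.2, 0.8], [0.6, 0.4])
backward = jsd([0.6, 0.4], [0.2, 0.8])

print(same, extreme, math.log(2), forward, backward)
```

Unlike (30), the quantity stays finite even when the two distributions have disjoint supports, since each is compared with the mixture rather than with the other.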

The minimum of the JSD occurs at $p^{(1)} = p^{(2)}$ and its maximum is reached when $p^{(1)}$ and
$p^{(2)}$ are two distinct deterministic distributions. In this case $s_X^{JS} = \ln 2$. As was mentioned
previously, one of the main properties of the JSD is that of being the square of a metric. A proof of
this fact can be found in reference [11]. Alternatively, this can be proved starting from some
classical results of harmonic analysis due to Schoenberg. The basic property of the
JSD that makes Schoenberg's theorem applicable is that $s_X^{JS}$ is a negative definite kernel, that
is, for every finite collection of real numbers $(c_i)_{i \le N}$ and for every corresponding finite set $(x_i)_{i \le N}$
of points in $X_N^+$, the implication

$$\sum_{i} c_i = 0 \ \Rightarrow \ \sum_{i,j} c_i c_j\, s_X^{JS}(x_i, x_j) \le 0 \tag{32}$$

is valid [18].
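
Both properties can be probed numerically, which is of course no substitute for the proofs cited above. The sketch below samples random distributions, checks the triangle inequality for the square root of the JSD and evaluates the quadratic form in (32) for coefficients that sum to zero. The sampling scheme, the seed and the helper names are assumptions made for this illustration.

```python
import math
import random

def shannon_entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0.0)

def jsd(p, q):
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mix) - 0.5 * shannon_entropy(p) - 0.5 * shannon_entropy(q)

def js_metric(p, q):
    """Square root of the JSD (clipped at zero against rounding)."""
    return math.sqrt(max(0.0, jsd(p, q)))

def random_distribution(rng, n):
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(1)

# Triangle inequality: the smallest value of d(a,b) + d(b,c) - d(a,c) found.
worst_gap = float("inf")
for _ in range(2000):
    a, b, c = (random_distribution(rng, 4) for _ in range(3))
    worst_gap = min(worst_gap, js_metric(a, b) + js_metric(b, c) - js_metric(a, c))

# Condition (32): coefficients with zero sum give a non-positive quadratic form.
points = [random_distribution(rng, 4) for _ in range(6)]
raw = [rng.uniform(-1.0, 1.0) for _ in points]
mean = sum(raw) / len(raw)
coeffs = [c - mean for c in raw]
quadratic_form = sum(ci * cj * jsd(xi, xj)
                     for ci, xi in zip(coeffs, points)
                     for cj, xj in zip(coeffs, points))

print(worst_gap, quadratic_form)
```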

Another consequence of Schoenberg's theorems is that the metric space $(X_N^+, \sqrt{s_X^{JS}})$ can
be isometrically mapped into a subset of a Hilbert space. This result establishes a connection
between information theory and differential geometry which could have interesting
consequences in the realm of quantum information theory.

Consider once again the states $|\psi\rangle$ and $|\eta\rangle$ given by (28) in order to evaluate the JSD
between the concomitant probability distributions $p^{(1)}(|\psi\rangle)$, $p^{(2)}(|\eta\rangle)$. By doing so we are
evaluating the associated distance in Hilbert space $s_H^{JS,A}$ between the states $|\psi\rangle$ and $|\eta\rangle$.
Expanding the pertinent JSD in powers of $dp_i$, one easily ascertains that the first non-vanishing
contributions are the quadratic ones

$$ds_H^{JS,A}(|\psi\rangle, |\eta\rangle) = \frac{1}{8}\sum_i \frac{dp_i^2}{p_i} \tag{33}$$

which coincides with (a half of) the Fubini-Study distance (29) up to this order in $dp_i$. Up
to the same order a similar relation exists between the JSD and both the Wootters and the
Bhattacharyya distances, that is

dsif^^ = lids^^y = i«'^)^ (34) 

which can be easily checked by inspection. Incidentally, it is worth mentioning that, when
we have a continuous probability distribution $p(x)$, the JSD between $p(x)$ and its shifted
version $p(x + \delta)$ is related to the Fisher information measure $I$ through the expression

$$s_X^{JS}(p(x), p(x + \delta)) \approx \frac{\delta^2}{8}\, I \tag{35}$$

with

$$I = \int \frac{1}{p(x)}\Big(\frac{dp(x)}{dx}\Big)^2 dx. \tag{36}$$
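
A quick numerical check of the small-shift relation between the JSD and the Fisher information can be made with a Gaussian density, for which $I = 1/\sigma^2$. The sketch below integrates the continuous JSD with the trapezoidal rule; the Gaussian test case, the integration window and the function names are assumptions chosen for the example.

```python
import math

def gaussian(x, mu=0.0, sigma=1.0):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def continuous_jsd(p, q, lo, hi, n):
    """Trapezoidal estimate of the JSD between two densities on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * h
        a, b = p(x), q(x)
        m = 0.5 * (a + b)
        f = 0.0
        if m > 0.0:
            f -= m * math.log(m)
        if a > 0.0:
            f += 0.5 * a * math.log(a)
        if b > 0.0:
            f += 0.5 * b * math.log(b)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * f * h
    return total

sigma, delta = 1.0, 0.05
fisher = 1.0 / sigma ** 2   # Fisher information of a Gaussian under translations

js = continuous_jsd(lambda x: gaussian(x, 0.0, sigma),
                    lambda x: gaussian(x + delta, 0.0, sigma),
                    -10.0, 10.0, 4000)
approx = delta ** 2 * fisher / 8.0

print(js, approx, js / approx)
```

For shifts this small the ratio of the two numbers stays very close to one; the agreement degrades as $\delta$ grows, since the relation only keeps the quadratic term.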



FIG. 1: Plots of $s_X^{JS}$ and $\frac{1}{2}(s_X^{W})^2$. See text for details.

Equations (34) have been established up to second order in $dp_i$. Let us proceed to higher
orders. To do this let us consider a binary system (a generalization to a system with a
greater number of states is straightforward). Let $p^{(1)} = (p, q)$ and $p^{(2)} = (p + dp, q - dp)$,
with $p + q = 1$, be two neighboring probability distributions and evaluate the pertinent JSD up
to order $dp^4$. We get

$$ds_X^{JS} = \frac{1}{8}\frac{1}{pq}\,dp^2 + \frac{1}{16}\frac{2p - 1}{(pq)^2}\,dp^3 + \frac{7}{192}\frac{3p^2 - 3p + 1}{(pq)^3}\,dp^4 + \ldots \tag{37}$$

In turn, the corresponding Wootters' distance squared, up to the same order is 
We detect coincidence between (37) and (38) up to order $dp^3$. The fourth order difference
equals In other words, the relation 

$$ds_X^{JS} = \frac{1}{2}\,(ds_X^{W})^2 \tag{39}$$

can be established up to third order in dp. Figure 1 shows how $s_X^{JS}$ and $\frac{1}{2}(s_X^{W})^2$ approach
each other for $p^{(1)} \to p^{(2)}$. We took $p^{(1)} = (a, 1 - a)$ and $p^{(2)} = (b, 1 - b)$ and evaluated
the corresponding distances as a function of b by fixing a = 0.5.
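
The binary expansion can be checked numerically. The sketch below compares the exact JSD of two neighboring binary distributions with its Taylor series in $dp$, written out from the derivatives of the binary entropy, and with one half of the squared Wootters distance (8), as in relation (39). The values of $p$ and $dp$ and the function names are assumptions made for the example.

```python
import math

def shannon_entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0.0)

def jsd(p, q):
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mix) - 0.5 * shannon_entropy(p) - 0.5 * shannon_entropy(q)

def jsd_series(p, dp, order=4):
    """Taylor series of the binary JSD in dp, truncated at the given order."""
    q = 1.0 - p
    terms = {
        2: dp ** 2 / (8 * p * q),
        3: (2 * p - 1) * dp ** 3 / (16 * (p * q) ** 2),
        4: 7 * (3 * p * p - 3 * p + 1) * dp ** 4 / (192 * (p * q) ** 3),
    }
    return sum(value for k, value in terms.items() if k <= order)

def half_wootters_squared(p, dp):
    """(1/2) arccos^2(B) for the distributions (p, q) and (p + dp, q - dp)."""
    q = 1.0 - p
    b = math.sqrt(p * (p + dp)) + math.sqrt(q * (q - dp))
    return 0.5 * math.acos(min(1.0, b)) ** 2

p, dp = 0.3, 0.01
exact = jsd([p, 1.0 - p], [p + dp, 1.0 - p - dp])

print(exact - jsd_series(p, dp, 2))       # third-order remainder, about 1e-7 in size
print(exact - jsd_series(p, dp, 4))       # fifth-order remainder, far smaller
print(exact - half_wootters_squared(p, dp))
```

The last difference is of fourth order in $dp$, in line with the statement that the two quantities agree up to third order.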

Going back to Wootters' distinguishability criterion (1), with equation (39) in mind, we
are in a position to enunciate an alternative criterion: two probability distributions $P^{(1)}$
and $P^{(2)}$ are distinguishable after L trials ($L \to \infty$) if and only if

$$\big(s_X^{JS}(P^{(1)}, P^{(2)})\big)^{1/2} > \frac{1}{\sqrt{2L}}. \tag{40}$$

There exist formal arguments in favor of this last statement, namely i) $(s_X^{JS})^{1/2}$ is a true
metric for the space $X_N^+$ and ii) this criterion is established in terms of an information
theoretic quantity, the JSD. Obviously inequality (40) is equivalent to inequality (1) for two
distributions "close" enough.
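
The two criteria can be compared on a concrete pair of nearby distributions. The sketch below is only an illustration under assumed names and sample values: it evaluates inequality (1) and inequality (40) for several numbers of trials and collects both verdicts.

```python
import math

def shannon_entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0.0)

def jsd(p, q):
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mix) - 0.5 * shannon_entropy(p) - 0.5 * shannon_entropy(q)

def wootters_criterion(p, q, trials):
    """Inequality (1): (sqrt(L)/2) * sqrt(sum_i (p_i - q_i)^2 / p_i) > 1."""
    chi = math.sqrt(sum((a - b) ** 2 / a for a, b in zip(p, q)))
    return 0.5 * math.sqrt(trials) * chi > 1.0

def jensen_shannon_criterion(p, q, trials):
    """Inequality (40): sqrt(JSD) > 1 / sqrt(2 L)."""
    return math.sqrt(max(0.0, jsd(p, q))) > 1.0 / math.sqrt(2.0 * trials)

# Hypothetical example: a fair coin versus a slightly biased one.
p = [0.50, 0.50]
q = [0.52, 0.48]

verdicts = {L: (wootters_criterion(p, q, L), jensen_shannon_criterion(p, q, L))
            for L in (100, 1000, 5000, 10000)}
for L, pair in verdicts.items():
    print(L, pair)
```

For distributions this close the two verdicts coincide away from the threshold; very near the threshold they can differ slightly, since the equivalence holds only to leading order.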

In the context of section II the following question emerges: what metric is the
representative of $s_X^{JS}$ in Hilbert space? Equivalently: what is the maximum of the metric
$s_H^{JS,A}$? In this case it is difficult (or impossible) to obtain an analytical expression for both metrics,
$s_H^{JS,A}$ and its maximum $S_H^{JS}$. Anyway, it is possible to deduce an upper bound for $s_H^{JS,A}$.
Let us consider a two-dimensional Hilbert space and let $|\Psi^{(1)}\rangle$ and $|\Psi^{(2)}\rangle$ be two arbitrary,
normalized states (the extension to a greater number of dimensions is straightforward). We
set $|\langle\Psi^{(1)}|\Psi^{(2)}\rangle| = \cos\varphi$ for $\varphi \in [0, \pi/2]$, that is, $\varphi$ is the Wootters distance between
$|\Psi^{(1)}\rangle$ and $|\Psi^{(2)}\rangle$.

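Let $\{|\phi_i\rangle\}_{i=1}^{2}$ be an orthonormal basis for $\mathcal{H}^2$. Any other orthonormal basis $\{|\phi_i(\theta)\rangle\}_{i=1}^{2}$ can
be related to it via the rotation

$$|\phi_1(\theta)\rangle = \frac{e^{i\theta}}{\sqrt{2}}|\phi_1\rangle + \frac{e^{-i\theta}}{\sqrt{2}}|\phi_2\rangle, \qquad |\phi_2(\theta)\rangle = -\frac{e^{i\theta}}{\sqrt{2}}|\phi_1\rangle + \frac{e^{-i\theta}}{\sqrt{2}}|\phi_2\rangle \tag{41}$$

with $\theta \in [0, 2\pi]$. We set $p_i^{(j)} = |\langle\phi_i|\Psi^{(j)}\rangle|^2$ and $p_i^{(j)}(\theta) = |\langle\phi_i(\theta)|\Psi^{(j)}\rangle|^2$. Also,
$\langle\phi_i|\Psi^{(j)}\rangle = \sqrt{p_i^{(j)}}\, e^{i\alpha_i^{(j)}}$ (via application of (7)). A little algebra then leads to

$$p_1^{(1)}(\theta) = \frac{p_1^{(1)} + p_2^{(1)}}{2} + \sqrt{p_1^{(1)} p_2^{(1)}}\, \cos\big(2\theta + \alpha_2^{(1)} - \alpha_1^{(1)}\big), \tag{42}$$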
FIG. 2: $\sqrt{2 s_H^{JS,A}(p^{(2)}, p^{(1)})}$ as a function of $\theta$ for $\varphi = 0.5$ and $\varphi = 0.8$.

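and

$$p_1^{(2)}(\theta) = \frac{p_1^{(2)} + p_2^{(2)}}{2} + \sqrt{p_1^{(2)} p_2^{(2)}}\, \cos\big(2\theta + \alpha_2^{(2)} - \alpha_1^{(2)}\big), \tag{43}$$

where the $\alpha_i^{(j)}$ are real numbers. Moreover, $p_2^{(j)}(\theta) = 1 - p_1^{(j)}(\theta)$ for $j = 1, 2$. Without loss of
generality we can take $|\phi_1\rangle = |\Psi^{(1)}\rangle$, so that $p_1^{(1)} = 1$, $\alpha_1^{(1)} = 0$, $\sqrt{p_1^{(2)}} = \cos\varphi$ and
$\sqrt{p_2^{(2)}} = \sin\varphi$. Thus, we can compute $\sqrt{2 s_H^{JS,A}(p^{(2)}, p^{(1)})}$ as a function of $\theta$. Figure 2 plots
such a function for different $\varphi$-values. Figure 3 depicts a 3D-plot of $\sqrt{2 s_H^{JS,A}}$ as a function
of $\theta$ and $\varphi$. In both cases we set the remaining phases $\alpha_i^{(j)}$ equal to zero.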

Out of these figures we conclude that the Wootters distance ($\varphi$) is an upper bound to
$\sqrt{2 s_H^{JS,A}(p^{(2)}, p^{(1)})}$. For $\varphi \to 0$, both quantities tend to coincide. In other words, we can
state the inequalities

$$S_H^{W}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle) \ge s_H^{W,A}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle) \ge \sqrt{2 s_H^{JS,A}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle)} \tag{44}$$

for any measuring device A.

Inequalities (44) allow us to conclude that $S_H^{W}(|\Psi^{(1)}\rangle, |\Psi^{(2)}\rangle)$ "represents" (as the
maximum, that is, as the lowest upper bound) $\sqrt{2 s_H^{JS,A}}$ in the Hilbert space. Furthermore, two
states distinguishable under the "Jensen-Shannon criterion" are obviously distinguishable
under Wootters' one.
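
The scan over the basis angle can be reproduced with a few lines of code. The sketch below assumes the two-dimensional setting described above, with $|\phi_1\rangle = |\Psi^{(1)}\rangle$, real amplitudes for $|\Psi^{(2)}\rangle$ and the rotated basis (41); it records, for each $\theta$, the Wootters distance in that basis and $\sqrt{2 s^{JS}}$. The grid size and the function names are illustrative assumptions.

```python
import cmath
import math

def shannon_entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0.0)

def jsd(p, q):
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mix) - 0.5 * shannon_entropy(p) - 0.5 * shannon_entropy(q)

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

def rotated_basis(theta):
    """Basis (41): amplitudes of |phi_1(theta)> and |phi_2(theta)>."""
    c = cmath.exp(1j * theta) / math.sqrt(2)
    return [[c, c.conjugate()], [-c, c.conjugate()]]

def probabilities(psi, basis):
    return [abs(inner(phi, psi)) ** 2 for phi in basis]

def scan(phi, steps=200):
    """Rows (theta, Wootters distance in the basis, sqrt(2 JSD)) for overlap cos(phi)."""
    psi1 = [1.0 + 0j, 0j]
    psi2 = [complex(math.cos(phi)), complex(math.sin(phi))]
    rows = []
    for k in range(steps + 1):
        theta = 2 * math.pi * k / steps
        basis = rotated_basis(theta)
        p1, p2 = probabilities(psi1, basis), probabilities(psi2, basis)
        overlap = sum(math.sqrt(a * b) for a, b in zip(p1, p2))
        s_w = math.acos(min(1.0, overlap))
        s_js = max(0.0, jsd(p1, p2))
        rows.append((theta, s_w, math.sqrt(2.0 * s_js)))
    return rows

for phi in (0.5, 0.8):
    largest = max(root for _, _, root in scan(phi))
    print(phi, largest)
```

In every row the chain of inequalities (44) can be read off directly: the last entry does not exceed the second, which in turn does not exceed $\varphi$.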



IV. CONCLUSIONS 

We have proposed an alternative distinguishability criterion for quantum states. This
criterion is established in terms of an information theoretic quantity, the JSD, which
exhibits many interesting properties, such as its metric character and its boundedness.
This provides a better formal context. In some sense we feel that the JSD
could be taken as a unified measure of distinguishability in the framework of
quantum information theory.

In the present work we focused on the case of pure states. An extension to mixed
states can be easily attained. In fact, by replacing in eq. (31) the Shannon entropy by the
von Neumann entropy, $H_N(\rho) = -\mathrm{Tr}(\rho \ln \rho)$, we can evaluate the JSD between two states
described by the density operators $\rho_1$ and $\rho_2$:

$$S^{JS}(\rho_1, \rho_2) = H_N\Big(\frac{\rho_1 + \rho_2}{2}\Big) - \frac{1}{2}H_N(\rho_1) - \frac{1}{2}H_N(\rho_2). \tag{45}$$

Remarkably, this quantity is always well defined, unlike the corresponding Kullback-Leibler
divergence of $\rho_1$ with respect to $\rho_2$ (the quantum analogue of (30)), which requires that the
support of $\rho_2$ be equal to or larger than that of $\rho_1$. A
more detailed study of the properties of JSD for mixed states will be presented elsewhere.
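
For a single qubit the mixed-state expression can be evaluated with elementary means, since the eigenvalues of a 2x2 density matrix follow from its trace and determinant. The sketch below is a minimal illustration restricted to that case; the matrix representation as nested lists and all function names are assumptions made for the example.

```python
import math

def eigenvalues_2x2(rho):
    """Eigenvalues of a 2x2 Hermitian matrix [[a, b], [conj(b), d]]."""
    (a, b), (_, d) = rho
    tr = (a + d).real
    det = (a * d).real - abs(b) ** 2
    root = math.sqrt(max(0.0, tr * tr - 4.0 * det))
    return [(tr + root) / 2.0, (tr - root) / 2.0]

def von_neumann_entropy(rho):
    """H_N(rho) = -Tr(rho ln rho), computed from the eigenvalues."""
    return -sum(x * math.log(x) for x in eigenvalues_2x2(rho) if x > 1e-12)

def quantum_jsd(rho1, rho2):
    """Eq. (45) for two qubit density matrices."""
    mix = [[(x + y) / 2.0 for x, y in zip(r1, r2)] for r1, r2 in zip(rho1, rho2)]
    return (von_neumann_entropy(mix)
            - 0.5 * von_neumann_entropy(rho1)
            - 0.5 * von_neumann_entropy(rho2))

def projector(psi):
    """Density matrix |psi><psi| of a pure state."""
    return [[complex(a) * complex(b).conjugate() for b in psi] for a in psi]

zero = projector([1.0, 0.0])
one = projector([0.0, 1.0])
plus = projector([1.0 / math.sqrt(2), 1.0 / math.sqrt(2)])

orthogonal = quantum_jsd(zero, one)     # orthogonal pure states
overlapping = quantum_jsd(zero, plus)   # pure states with overlap 1/sqrt(2)
identical = quantum_jsd(plus, plus)

print(orthogonal, overlapping, identical)
```

For two pure states the result reduces to the entropy of their equal mixture, whose eigenvalues are $(1 \pm |\langle\Psi^{(1)}|\Psi^{(2)}\rangle|)/2$, so it remains finite even for orthogonal states.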

Finally, it is worth mentioning that the JSD can also be interpreted in a Bayesian
probabilistic sense. In fact, the JSD gives both lower and upper bounds to the Bayes probability
of error. Therefore, it deserves careful scrutiny in the light of some alternative quantum
descriptions [21].
FIG. 3: 3D plot of $\sqrt{2 s_H^{JS,A}(p^{(2)}, p^{(1)})}$ as a function of $\theta$ and $\varphi$. One clearly appreciates the bound
in the plane $z = \varphi$.